GPU accelerated encoder #895

Merged · 55 commits · Dec 3, 2024
Conversation

dmanc
Contributor

@dmanc dmanc commented Nov 14, 2024

Why are these changes needed?

This PR adds support for multiple backends for the encoder. Initially we only have the gnark and icicle backends, see https://dev.ingonyama.com/icicle/overview for more information on the icicle backend.

A backend may include GPU acceleration as an option, see https://dev.ingonyama.com/icicle/install_cuda_backend for icicle's GPU accelerated backend.

In order to use the icicle backend it must be compiled with the icicle build tag:

go build -tags=icicle main.go
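As an illustration of how the tag gates the backend (the file names and the simplified signature here are for illustration; the PR's actual files live under encoding/), the pattern is a pair of files with complementary build constraints:

```go
// icicle.go — compiled only when building with `go build -tags=icicle`.
//go:build icicle

package prover

// The real icicle-backed constructor would live here.
```

```go
// noicicle.go — the default build; keeps the icicle dependency out entirely.
//go:build !icicle

package prover

import "fmt"

// Stub with the same name, so callers compile either way.
func CreateIcicleBackendProver() (any, error) {
	return nil, fmt.Errorf("icicle backend called without icicle build tag")
}
```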

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@dmanc dmanc requested review from bxue-l2 and jianoaix November 14, 2024 08:31
Contributor

@samlaf samlaf left a comment

Yikes, lost all my comments while reviewing on github.dev... friggin' GitHub. Some GraphQL internal error or something. So I'll just submit this first set of reviews in case. Don't really feel like reviewing again lol... another reason why small PRs are good and big PRs are bad.

api/clients/retrieval_client_test.go
core/test/core_test.go
disperser/cmd/encoder/flags/flags.go
disperser/cmd/encoder/config.go
Contributor

@samlaf samlaf left a comment

Just spent 2 hours reviewing and I barely even got to the important stuff. Suggest no more PRs like this, for the sake of our industry, and of your reviewers' mental health. :D

docker-bake.hcl
encoding/bench/Makefile
Comment on lines 17 to 23
type IcicleDevice struct {
	Device         icicle_runtime.Device
	NttCfg         core.NTTConfig[[icicle_bn254.SCALAR_LIMBS]uint32]
	MsmCfg         core.MSMConfig
	FlatFFTPointsT []icicle_bn254.Affine
	SRSG1Icicle    []icicle_bn254.Affine
}
Contributor

Is the point that we can have a different ntt/msm config and points loaded on different devices (gpus?)? Do we actually make use of this flexibility? If so, please add comment explaining.

Contributor Author

No there's only going to be a single configuration loaded at startup time. Right now we don't support multiple GPUs for an encoder.

Contributor

Is this mentioned anywhere? Would be good to if not.

encoding/icicle/device_setup.go
Comment on lines 58 to 67
setupErr = fmt.Errorf("could not setup NTT")
return
Contributor

just return the setupErr directly?

Contributor

also prob want to wrap the icicleErr to help debug

Contributor Author

Cannot return directly, because I don't think the RunOnDevice func will let me. It looks like this:

func RunOnDevice(device *Device, funcToRun func(args ...any), args ...any) {
    go func(deviceToRunOn *Device) {
        defer runtime.UnlockOSThread()
        runtime.LockOSThread()
        originalDevice, _ := GetActiveDevice()
        SetDevice(deviceToRunOn)
        funcToRun(args...)
        SetDevice(originalDevice)
    }(device)
}

Contributor

ah my bad I completely missed the function signature.
I see so the code passed to RunOnDevice is run on the gpu...? Did you test that your shared pointer for setupErr is working then? This would mean the gpu is able to go write in the cpu's RAM. Does this only work on mac where cpu/gpu share memory? Or is this something that always works?

Contributor Author

No, RunOnDevice locks the OS thread, so all of that code runs on the same OS thread. The issue is that with goroutines, you can call SetDevice to select the GPU device while running on thread 1, but the goroutine can be rescheduled onto thread 2 mid-execution, which would make it use the default device (CPU).

encoding/icicle/device_setup.go
encoding/icicle/msm_setup.go
encoding/icicle/ntt_setup.go
Contributor

@jianoaix jianoaix left a comment

Can we have a diff-neutral PR just for the refactoring? It seems there is code here that doesn't need deep review, since it's just existing code moved around?

encoding/rs/serialization.go
encoding/rs/noicicle.go
encoding/rs/serialization.go
@dmanc
Contributor Author

dmanc commented Nov 15, 2024

Can we have a diff-neutral PR just for refactoring? It seems there are code here which doesn't need deep review as they are just existing code moved around?

Yes, if we're happy with this new configuration method, then I'll make another PR with that and rebase this one.

RUN go build -tags=icicle -o ./bin/server ./cmd/encoder

# Start a new stage for the base image
FROM nvidia/cuda:12.2.2-base-ubuntu22.04
Contributor Author

Image size comparison:

ghcr.io/layr-labs/eigenda/encoder latest 781c3866c5dd 6 hours ago 41MB
6129846e8150 6 hours ago 41MB
ghcr.io/layr-labs/eigenda/encoder-icicle latest f530fe9c250d 21 hours ago 760MB

Contributor

what makes it so large? near 20x is a lot

Contributor Author

Didn't look into it, but my guess right now would be the CUDA base image, which contains the CUDA runtime.

disperser/cmd/encoder/main.go

func CreateIcicleBackendProver(p *Prover, params encoding.EncodingParams, fs *fft.FFTSettings, ks *kzg.KZGSettings) (*ParametrizedProver, error) {
	// Not supported
	return nil, fmt.Errorf("icicle backend called without icicle build tag")
Contributor Author

The point of the noicicle.go file is that the icicle backend should only be functional when the icicle build tag is used. This way we don't need to include the icicle backend files in the traditional encoder build, which could otherwise break the existing build.

}

// KzgCommitmentsBackend represents a backend capable of computing various KZG commitments.
type KzgCommitmentsBackend interface {
Contributor Author

Moved commitments to its own backend in case we would like to accelerate them in the future.

@@ -331,3 +457,99 @@ func toUint64Array(chunkIndices []encoding.ChunkNumber) []uint64 {
}
return res
}

func (p *Prover) newProver(params encoding.EncodingParams) (*ParametrizedProver, error) {
Contributor Author

this file should be reviewed carefully

)

// VerifierOption defines a function that configures a Verifier
type VerifierOption func(*Verifier) error
Contributor Author

This can go in its own PR; it's applying the same options-pattern refactor to the verifier, so it's not super relevant to the GPU encoder.

Contributor

It looks more verbose and deviates from the convention; what are the main problems it solves here compared to passing in a config struct?

Contributor Author

One argument I would give for this configuration pattern is that it provides a simpler constructor that is extensible and backwards compatible.

For example, with a config struct we would have to modify every client that calls the constructor when we add a new configuration value, whereas with this pattern you add a WithX function in the library that sets the field and does whatever validation it needs. Clients can then optionally adopt the new configuration.
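A minimal sketch of the pattern being argued for (the field and option names here are illustrative, not the PR's actual API):

```go
package main

import (
	"fmt"
	"runtime"
)

// Verifier and its fields are hypothetical stand-ins for the PR's type.
type Verifier struct {
	numWorkers int
	verbose    bool
}

// VerifierOption configures a Verifier and may validate its input.
type VerifierOption func(*Verifier) error

func WithNumWorkers(n int) VerifierOption {
	return func(v *Verifier) error {
		if n <= 0 {
			return fmt.Errorf("numWorkers must be positive, got %d", n)
		}
		v.numWorkers = n
		return nil
	}
}

func WithVerbose(on bool) VerifierOption {
	return func(v *Verifier) error { v.verbose = on; return nil }
}

// NewVerifier stays backwards compatible: adding a new WithX later does not
// change this signature, so existing callers keep compiling unchanged.
func NewVerifier(opts ...VerifierOption) (*Verifier, error) {
	v := &Verifier{numWorkers: runtime.GOMAXPROCS(0)} // defaults first
	for _, opt := range opts {
		if err := opt(v); err != nil {
			return nil, err
		}
	}
	return v, nil
}

func main() {
	v, err := NewVerifier(WithNumWorkers(4), WithVerbose(true))
	if err != nil {
		panic(err)
	}
	fmt.Println(v.numWorkers, v.verbose)
}
```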

Contributor

with a config struct we would have to modify every client that calls the constructor when we add a new configuration value

Not sure about this, if that new field is optional, the caller doesn't need to do anything. If it's required, callers will have to make changes in either case.

My concerns are the boilerplate and the scalability of continually adding WithXX; we may end up with a long list of fields (like in some big structs).
Also, it looks like it still needs to operate on a config struct if the validation runs across multiple fields (e.g. invariants defined over multiple fields).

Contributor Author

@dmanc dmanc Nov 16, 2024

My concerns are the boilerplate and the scalability of continually adding WithXX; we may end up with a long list of fields (like in some big structs).

These seem like tractable problems though, here is an example in grpc library:
https://github.com/grpc/grpc-go/blob/66385b28b3fe71a4895f00d581ede0a344743a3f/dialoptions.go#L248

Contributor

Think I agree with both of you. I've grown fond of the options pattern and like it best, but I still think consistency across a codebase is the most important thing, so kind of agree with jian about this.

Contributor Author

No problem, I'll revert to the previous method

"github.com/consensys/gnark-crypto/ecc/bn254/fr"
)

type ParametrizedEncoder struct {
Contributor Author

Refactored to make the encoder parametrized by FFT settings (similar to Prover). I don't think it's too valuable, since the max time is ~600ms for loading the largest FFT setting. I'm thinking of reverting.

@@ -0,0 +1,103 @@
//go:build icicle

package icicle
Contributor Author

This folder contains the icicle backend-related setup.

@@ -0,0 +1,46 @@
//go:build icicle

package icicle
Contributor Author

This folder contains the icicle-related computations for MultiProofs.

"github.com/ingonyama-zk/icicle/v3/wrappers/golang/runtime"
)

type RsIcicleComputeDevice struct {
Contributor Author

These are the icicle-related computations for RS encode.

@dmanc
Contributor Author

dmanc commented Nov 15, 2024

One thing I'm not too sure how to deal with is testing the Icicle/GPU encoder's correctness. The only strategy I can think of is to compare the output of the default backend and the icicle backend.

Also if we wanted it to be part of CI, then we need to acquire GPU based github action runners.
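The comparison strategy described above could be sketched like this. The `Encoder` interface and the `xorEncoder` stub are illustrative stand-ins; in the real code the two sides would be the gnark and icicle implementations behind the same interface.

```go
package main

import (
	"fmt"
	"reflect"
)

// Encoder is a stand-in for whatever interface both backends implement.
type Encoder interface {
	ExtendPolyEval(coeffs []uint64) ([]uint64, error)
}

// checkBackendsAgree feeds the same coefficients to both backends and
// fails if the extended evaluations differ.
func checkBackendsAgree(a, b Encoder, coeffs []uint64) error {
	evalsA, err := a.ExtendPolyEval(coeffs)
	if err != nil {
		return fmt.Errorf("backend A: %w", err)
	}
	evalsB, err := b.ExtendPolyEval(coeffs)
	if err != nil {
		return fmt.Errorf("backend B: %w", err)
	}
	if !reflect.DeepEqual(evalsA, evalsB) {
		return fmt.Errorf("backends disagree: %v vs %v", evalsA, evalsB)
	}
	return nil
}

// xorEncoder is a toy deterministic "backend" used only to demo the check.
type xorEncoder struct{}

func (xorEncoder) ExtendPolyEval(coeffs []uint64) ([]uint64, error) {
	out := make([]uint64, 2*len(coeffs))
	copy(out, coeffs)
	for i, c := range coeffs {
		out[len(coeffs)+i] = c ^ 0xff // fake "extension", not real RS
	}
	return out, nil
}

func main() {
	err := checkBackendsAgree(xorEncoder{}, xorEncoder{}, []uint64{1, 2, 3})
	fmt.Println(err)
}
```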

encoding/rs/icicle.go

func CreateIcicleBackendEncoder(e *Encoder, params encoding.EncodingParams, fs *fft.FFTSettings) (*ParametrizedEncoder, error) {
icicleDevice, err := icicle.NewIcicleDevice(icicle.IcicleDeviceConfig{
GPUEnable: e.Config.GPUEnable,
Contributor

It looks like Icicle will also support CPU now?

Contributor Author

Yes, but it's not as performant as the gnark backend; I'm trying to look into why.


var evals []fr.Element
g.NttCfg.BatchSize = int32(1)
runtime.RunOnDevice(&g.Device, func(args ...any) {
Contributor

If the GPU is enabled, then this func will be executed on GPU by the runtime?

Contributor Author

yes because the device variable will be set to the GPU device here

Contributor

For performance, it looks like the main data movement to/from the GPU will be the coefficients and the resulting evaluations?

Contributor Author

For RS encode, yes.

For multiproofs it's the same (move coefficients in, move proofs out), but in addition we need to transfer the SRS table to compute the MSM. This could be done in SetupMSM at initialization, but I'm not sure how much speedup that gives.


Fs: fs,
verbose: verbose,
NumRSWorker: runtime.GOMAXPROCS(0),
return &ParametrizedEncoder{
Contributor

So when you say "backend", it's a software library that computes the RS encoding? I.e., it decouples the device (CPU/GPU) from the backend (the software that executes the compute on the device)?

Contributor Author

Right now I consider a backend to be a library that implements the cryptographic primitives like NTT and MSM. We have our own implementation of NTT borrowed from protolambda, and MSM from gnark.

It's plausible that there exist libraries implementing the higher-level primitives like multiproofs, commitments, and RS encode, and we could make our code more of a frontend that calls those. But for now we focus on the core primitives that give the speedup factor.

encoding/rs/extend_poly.go
@dmanc dmanc marked this pull request as ready for review November 16, 2024 00:03
@dmanc dmanc force-pushed the gpu-encoder branch 3 times, most recently from d716e07 to 51b0403 Compare November 19, 2024 00:12
@mooselumph mooselumph self-requested a review November 19, 2024 19:14
Srs *kzg.SRS
G2Trailing []bn254.G2Affine
}

Contributor

would be good to put a comment, describing e(ComputeCommitment, g2_shifted) = e(g1, ComputeLengthCommitment)

Contributor

should have been
e(g1_shifted, ComputeLengthCommitment) = e(g1, ComputeLengthProof)
e(ComputeCommitment, g2) = e(g1, ComputeLengthCommitment)

Contributor Author

I decided to do the documentation update in a separate PR.

Contributor

For documentation, I think this one appears to be quite readable: https://github.com/crate-crypto/go-kzg-4844/blob/master/internal/kzg/kzg_prove.go, which we can take some inspiration from

@@ -162,7 +162,8 @@ func RunDisperserServer(ctx *cli.Context) error {
bucketName := config.BlobstoreConfig.BucketName
logger.Info("Blob store", "bucket", bucketName)
if config.DisperserVersion == V2 {
prover, err := prover.NewProver(&config.EncodingConfig, true)
config.EncodingConfig.LoadG2Points = true
prover, err := prover.NewProver(&config.EncodingConfig, nil)
Contributor

the first arg is kzgConfig, according to the func. Name issue?

Contributor Author

I think it's acceptable

encoding/kzg/prover/icicle.go
return nil, nil, fmt.Errorf("failed to create SRS table: %w", err)
}

fftPoints, err := subTable.GetSubTables(params.NumChunks, params.ChunkLength)
Contributor

I don't think we ever need fftPoints, chance for optimization

Contributor Author

Will include this in followup optimizations

jobChan := make(chan JobRequest, numWorker)
results := make(chan error, numWorker)

for w := uint64(0); w < numWorker; w++ {
Contributor

try workerpool?

Contributor Author

Will include in a followup. Wrote a ticket.

Contributor

@jianoaix jianoaix left a comment

LG mostly, a bunch of small comments


encodingConfig := &encoding.Config{
BackendType: backendType,
GPUEnable: config.ServerConfig.GPUEnable,
NumWorker: config.EncoderConfig.NumWorker,
Contributor

Do we have understanding about how this parallelism interacts with the blob level parallelism at the server?

Contributor Author

From my understanding, the interaction is that it can introduce CPU contention if we're handling more than one blob concurrently. In the worst case, we set NumWorkers = number of CPUs on the machine and two blobs are doing computations that each require all workers.

}

// Setup MSM if parameters are provided
if config.FFTPointsT != nil && config.SRSG1 != nil {
Contributor

Should this be validated (right after entering NewIcicleDevice) for properties of SRSG1 and FFTPointsT? Or are they validated at the caller?

device := runtime.CreateDevice("CPU", 0)
if runtime.IsDeviceAvailable(&device) {
slog.Info("CPU device available, setting device")
}
Contributor

Is it safe to assume the CreateDevice never fails for CPU?

Contributor Author

I believe that's the case but not sure.

Contributor Author

I'm not sure about the behavior.

Contributor

It may be safer to return an error and bubble up

}

slog.Info("CUDA device not available, falling back to CPU")
return setupCPUDevice()
Contributor

GPUEnable is set by a cmdline flag; should this not fall back in that case? Or comment in the flag that it'll fall back if there is no GPU.

Contributor Author

Added comment to flag

@@ -13,66 +14,18 @@ import (
"github.com/consensys/gnark-crypto/ecc/bn254/fr"
)

type KzgCpuProofDevice struct {
type KzgMultiProofGnarkBackend struct {
Contributor

Shall we create a gnark folder similar to icicle to organize these implementations with gnark?


sumVec, err := p.MsmBatchOnDevice(flattenStoreCopyToDevice, p.FlatFFTPointsT, int(numPoly)*int(dimE)*2)
if err != nil {
icicleErr = fmt.Errorf("msm error: %w", err)
Contributor

here and below: why not return on such an error? The proof computing is already failed at these points

Contributor

How is this used?

Contributor Author

#895 (comment)

Essentially just makes it so we don't have to compile icicle libraries unless we explicitly add the build tag.


func (p *Prover) createGnarkBackendProver(params encoding.EncodingParams, fs *fft.FFTSettings, ks *kzg.KZGSettings) (*ParametrizedProver, error) {
if p.Config.GPUEnable {
return nil, fmt.Errorf("GPU is not supported in default backend")
Contributor

nit: "default" -> "gnark" (some occurrences below as well)

@@ -14,17 +14,21 @@ import (
)

func TestBatchEquivalence(t *testing.T) {
Contributor

It'll be nice to have an icicle-gnark equivalence test for proofs, but it may need a GPU setup for unit tests... Maybe we can start with CPU (for both), in a follow-up.

Contributor

@jianoaix jianoaix left a comment

Mostly LG. There were a couple of comments from before that would be nice to address.


preprocessDone := time.Now()

// Prepare data before GPU operations
flattenCoeffStoreFr := make([]fr.Element, len(coeffStore)*len(coeffStore[0]))
Contributor

computeCoeffStore could create a 1D result directly, and then this flattening step can be saved.
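The suggestion above amounts to building the coefficient store row-major into one flat slice from the start instead of producing a [][]T and copying it afterwards. A minimal sketch, with uint64 standing in for fr.Element and a hypothetical coeff generator:

```go
package main

import "fmt"

// computeFlatCoeffStore fills a single flat slice directly, avoiding the
// intermediate 2D slice and the follow-up flattening copy.
func computeFlatCoeffStore(rows, cols int, coeff func(r, c int) uint64) []uint64 {
	flat := make([]uint64, rows*cols) // one allocation for everything
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			flat[r*cols+c] = coeff(r, c) // row-major indexing
		}
	}
	return flat
}

func main() {
	flat := computeFlatCoeffStore(2, 3, func(r, c int) uint64 { return uint64(10*r + c) })
	fmt.Println(flat) // [0 1 2 10 11 12]
}
```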


// KzgMultiProofsBackend represents a backend capable of computing KZG multiproofs.
type KzgMultiProofsBackend interface {
ComputeMultiFrameProof(blobFr []fr.Element, numChunks, chunkLen, numWorker uint64) ([]bn254.G1Affine, error)
Contributor

So the blobFr are points on the blob's polynomial, but why doesn't this produce a single bn254.G1Affine (which is the proof)?

Contributor Author

blobFr are the coefficients of the polynomial. We actually have some documentation about it here https://github.com/Layr-Labs/eigenda/blob/master/docs/spec/attestation/amortized-proving.md#amortized-multireveal-proof-generation-with-the-fft.

These two resources were good for understanding the implementation details (although I still don't fully grasp it yet):
https://alinush.github.io/2021/06/17/Feist-Khovratovich-technique-for-computing-KZG-proofs-fast.html
https://alinush.github.io/2020/03/19/multiplying-a-vector-by-a-toeplitz-matrix.html

// Encoding Reed Solomon using FFT
func (g *RsIcicleBackend) ExtendPolyEval(coeffs []fr.Element) ([]fr.Element, error) {
// Lock the GPU for operations
g.GpuLock.Lock()
Contributor

If it's not using GPU (which is possible for icicle), then wouldn't this lock hurt performance?

Contributor Author

The perf is already quite poor on the CPU version of icicle. I would prefer to tackle that separately if that's okay.

Contributor

@jianoaix jianoaix left a comment

LGTM, well done!

@dmanc dmanc merged commit a84eff9 into Layr-Labs:master Dec 3, 2024
6 checks passed
hopeyen pushed a commit that referenced this pull request Dec 5, 2024
hopeyen pushed a commit that referenced this pull request Dec 6, 2024